November 3, 2025
Even though social development is qualitatively different from cognition, it is natural to compare these scores
Data are scored using Rasch family models
Between-items multidimensionality
Dimensions “hang together”
Unifying higher-order dimension
Reference dimension (Ackerman, 1992)
Weighted average of dimensions
\[ \text{logit}(P_{ij}) = \theta_j - \delta_i \]
\(P_{ij}\): probability of a keyed response for person \(j\), item \(i\)
\(\theta_j\): person \(j\)’s location (ability)
\(\delta_i\): item \(i\)’s location (difficulty)
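As a quick numerical illustration of the model equation above, the following minimal sketch converts a person location and an item location into a response probability. The function name and the example values are illustrative assumptions, not part of the talk's data.

```python
import math

def rasch_probability(theta, delta):
    """Probability of a keyed response under the Rasch model:
    logit(P) = theta - delta."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

# Illustrative values only (hypothetical, not from the TIMSS or KIDS data).
p_equal = rasch_probability(theta=0.5, delta=0.5)   # person located exactly at the item
p_above = rasch_probability(theta=1.0, delta=-1.0)  # person 2 logits above the item

print(round(p_equal, 3), round(p_above, 3))
```

When the person and item locations coincide, the probability is .5; the probability rises as \(\theta_j\) moves above \(\delta_i\).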
\[\text{logit}(P_{ij}) = \alpha(\theta_j - \delta_i)\]
\[\text{logit}(P_{i(d)j}) = \alpha_d(\theta_{dj} - \delta_{i(d)})\]
\(i(d)\): item \(i\) belonging to dimension \(d\)
\(\alpha_d\): might change across dimensions
Under the unidimensional Rasch model, items can be uniquely ordered
\(\delta_i\) is ordered according to the item’s difficulty
\(\hat{p}_i\) (the proportion of correct responses to item \(i\)) is a sufficient statistic for \(\delta_i\)
\(\hat{p}_i\) and \(\hat{\delta}_i\) are related by a unique bijective function
| | ConQuest | TAM | 1PL |
|---|---|---|---|
| Latent Means | Estimate \(\mu_1\), \(\mu_2\) | Set \(\mu_1 = \mu_2 = 0\) | Set \(\mu_1 = \mu_2 = 0\) |
| Item Difficulties | Set \(\sum\hat{\delta}_{i(d)}=0\) | Estimate \(\delta_{i(d)}\) | Estimate \(\delta_{i(d)}\) |
| Latent Variances | Estimate \(\phi_{11}\), \(\phi_{22}\) | Estimate \(\phi_{11}\), \(\phi_{22}\) | Set \(\phi_{11} = \phi_{22} = 1\) |
| Item Steepnesses | Set \(\alpha_{d}=1\) | Set \(\alpha_{d}=1\) | Estimate \(\alpha_d\) |
1000 responses from Australian students to 12 multiple-choice TIMSS items
These data are described in chapters 3, 9 of the ConQuest manual
6 items measure math ability
6 items measure science ability
Wu, M., Adams, R., Wilson, M., & Haldane, S. (2007). ACER ConQuest: Generalised item response modelling software (Version 2.0) [computer software]. Melbourne, Australia: ACER.
Transform \(\alpha_d\), \(\theta_{dj}\), \(\delta_{i(d)}\) to \(\tilde{\alpha}_d\), \(\tilde{\theta}_{dj}\), \(\tilde{\delta}_{i(d)}\)
Scaling constant \(r_d\): changes the slope of dimensions
Shift constant \(s_d\): changes the location of dimensions
\[\tilde{\alpha}_d = \frac{\alpha_d}{r_d}\]
\[\tilde{\theta}_{dj} = r_d \theta_{dj} + s_d\]
\[\tilde{\delta}_{i(d)} = r_d \delta_{i(d)} + s_d\]
Original latent mean and variance: \(\mu_d\), \(\sigma^2_d\)
Transformed latent mean and variance: \(r_d \mu_d + s_d\), \(r^2_d \sigma^2_d\)
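The transformation leaves every model-implied logit unchanged, which is why it can be applied after fitting. The sketch below checks this numerically for one dimension; the values of `r`, `s`, and the person and item locations are made-up assumptions chosen only for illustration.

```python
import statistics

def logit_p(alpha, theta, delta):
    """Model-implied logit: alpha * (theta - delta)."""
    return alpha * (theta - delta)

def transform(alpha, thetas, deltas, r, s):
    """Apply the scaling constant r and shift constant s to one dimension."""
    return alpha / r, [r * t + s for t in thetas], [r * d + s for d in deltas]

# Hypothetical parameters for a single dimension.
alpha = 1.0
thetas = [-1.2, 0.0, 0.4, 1.5]
deltas = [-0.5, 0.3, 1.1]
r, s = 1.25, -0.4

alpha_t, thetas_t, deltas_t = transform(alpha, thetas, deltas, r, s)

# Largest change in any person-by-item logit after transforming.
max_diff = max(
    abs(logit_p(alpha, t, d) - logit_p(alpha_t, tt, dt))
    for t, tt in zip(thetas, thetas_t)
    for d, dt in zip(deltas, deltas_t)
)

# The mean and variance of the person locations change as stated above.
mean_before, mean_after = statistics.mean(thetas), statistics.mean(thetas_t)
var_before, var_after = statistics.pvariance(thetas), statistics.pvariance(thetas_t)

print(max_diff, mean_after, var_after)
```

The logits agree up to floating-point error, while the mean becomes \(r_d \mu_d + s_d\) and the variance becomes \(r^2_d \sigma^2_d\).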
As \(\hat{p}_{i(d)}\) increases, the corresponding \(\delta_{i(d)}\) decreases
Theoretical definition:
Scales are aligned if the same sufficient statistic implies the same parameter estimate, regardless of dimension
Practical evaluation:
Scales are aligned if the absolute rank-order correlation between \(\hat{p}_{i(d)}\) and \(\hat{\delta}_{i(d)}\) (combining dimensions) equals 1
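The practical criterion can be sketched as follows. This example assumes Kendall's tau-a with no tied values (the talk does not say which variant was used), and the proportions and difficulties are invented for illustration, with three items on each of two dimensions.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.
    Assumes no ties; a sketch, not a full-featured implementation."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical proportions correct: items 1-3 on dimension 1, items 4-6 on dimension 2.
p_hat = [0.80, 0.60, 0.40, 0.70, 0.50, 0.30]

# Difficulties on a common scale: ordering agrees with p_hat across dimensions.
delta_aligned = [-1.40, -0.40, 0.40, -0.85, 0.00, 0.85]

# Same items, but dimension 2 sits on a shifted scale.
delta_misaligned = [-1.40, -0.40, 0.40, -0.25, 0.60, 1.45]

tau_aligned = kendall_tau_a(p_hat, delta_aligned)
tau_misaligned = kendall_tau_a(p_hat, delta_misaligned)

print(abs(tau_aligned), abs(tau_misaligned))
```

Within each dimension the ordering is perfect in both cases; only the combined ordering reveals the misalignment.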
Analogy: factor rotation in exploratory factor analysis
Reference Dimension Approach
Find item ordering by fitting all items to a unidimensional model
Delta dimensional alignment (DDA)
Originally described by Schwartz & Ayers (2011), others
Functional Approach
Estimate the functional relationship between \(\hat{p}_{i(d)}\) and \(\hat{\delta}_{i(d)}\)
Logistic regression alignment (LRA)
Fit data to a unidimensional model \(\mathcal{U}\): \(\hat{\boldsymbol{\delta}}_{\mathcal{U}d}\)
Fit data to a between-items multidimensional model \(\mathcal{M}\): \(\hat{\boldsymbol{\delta}}_{\mathcal{M}d}\)
For each dimension \(d\), find the transformation parameters \(\hat{r}_d\) and \(\hat{s}_d\) such that \(\hat{\boldsymbol{\delta}}_{\mathcal{U}d}\) and \(\hat{\boldsymbol{\delta}}_{\mathcal{M}d}\) have the same mean (\(\text{mn}\)) and standard deviation (\(\text{sd}\))
\[\begin{align*} \hat{r}_d & = \frac{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{U}d})}{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{M}d})} \\ \hat{s}_d & = \text{mn}(\hat{\boldsymbol{\delta}}_{\mathcal{U}d}) - \hat{r}_d\text{mn}(\hat{\boldsymbol{\delta}}_{\mathcal{M}d}) \end{align*}\]
\[\begin{align*} \hat{r}_d & = \frac{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{U}d})\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{M}1})}{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{M}d})\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{U}1})} \\ \hat{s}_d & = \frac{\text{mn}(\hat{\boldsymbol{\delta}}_{\mathcal{U}d}) - \text{mn}(\hat{\boldsymbol{\delta}}_{\mathcal{U}1}) + \frac{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{U}1})}{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{M}1})}\text{mn}(\hat{\boldsymbol{\delta}}_{\mathcal{M}1}) - \frac{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{U}d})}{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{M}d})}\text{mn}(\hat{\boldsymbol{\delta}}_{\mathcal{M}d})}{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{U}1}) / \text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{M}1})} \end{align*}\]
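The two sets of DDA formulas above can be sketched as follows. The function names and the difficulty values are hypothetical; the second function rescales the raw constants so that a chosen reference dimension keeps \(\hat{r} = 1\) and \(\hat{s} = 0\), which is algebraically the same as the second pair of formulas with dimension 1 as the reference.

```python
import statistics

def dda_constants(delta_uni, delta_multi):
    """Match the mean and sd of one dimension's multidimensional difficulties
    to the unidimensional difficulties of the same items."""
    r = statistics.stdev(delta_uni) / statistics.stdev(delta_multi)
    s = statistics.mean(delta_uni) - r * statistics.mean(delta_multi)
    return r, s

def dda_with_reference(delta_uni_by_dim, delta_multi_by_dim, ref=0):
    """Re-express the constants so the reference dimension is left unchanged."""
    raw = [dda_constants(u, m)
           for u, m in zip(delta_uni_by_dim, delta_multi_by_dim)]
    r_ref, s_ref = raw[ref]
    return [(r / r_ref, (s - s_ref) / r_ref) for r, s in raw]

# Hypothetical difficulty estimates: one list per dimension.
delta_uni = [[-1.0, 0.0, 1.0], [-0.5, 0.5, 1.5]]     # unidimensional fit
delta_multi = [[-0.8, 0.0, 0.8], [-1.5, 0.0, 1.5]]   # multidimensional fit

constants = dda_with_reference(delta_uni, delta_multi)
for d, (r, s) in enumerate(constants, start=1):
    print(f"dimension {d}: r = {r:.3f}, s = {s:.3f}")
```

The ratio of standard deviations is the same whether sample or population formulas are used, so that choice does not affect \(\hat{r}_d\).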
\[\text{logit}(P(x_{ij} = 1 \vert \hat{\delta}_{i(d)})) = \hat{\gamma}_{0d} + \hat{\gamma}_{1d}\hat{\delta}_{i(d)}\]
\[\begin{align*} \hat{r}_d & = \frac{\hat{\gamma}_{1d}}{\hat{\gamma}_{11}} \\ \hat{s}_d & = \frac{\hat{\gamma}_{0d} - \hat{\gamma}_{01}}{\hat{\gamma}_{11}} \end{align*}\]
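One way to see where these constants come from, assuming dimension 1 is the reference: require that the reference dimension's regression line, evaluated at the transformed difficulty \(\tilde{\delta}_{i(d)} = r_d\hat{\delta}_{i(d)} + s_d\), reproduce dimension \(d\)'s fitted logit for every value of \(\hat{\delta}_{i(d)}\).

\[\begin{align*} \hat{\gamma}_{01} + \hat{\gamma}_{11}\left(r_d\hat{\delta}_{i(d)} + s_d\right) & = \hat{\gamma}_{0d} + \hat{\gamma}_{1d}\hat{\delta}_{i(d)} \\ \hat{\gamma}_{11} r_d & = \hat{\gamma}_{1d} \quad \text{(matching slopes)} \\ \hat{\gamma}_{01} + \hat{\gamma}_{11} s_d & = \hat{\gamma}_{0d} \quad \text{(matching intercepts)} \end{align*}\]

Solving the last two lines gives the expressions for \(\hat{r}_d\) and \(\hat{s}_d\) above, and setting \(d = 1\) returns \(\hat{r}_1 = 1\), \(\hat{s}_1 = 0\).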
Goals:
What factors affect alignment accuracy?
Any performance differences between DDA and LRA?
Factors manipulated:
Sample size: {200, 500, 1000}
Number of items per dimension: {5, 20}
Mean of \(\theta_2\): {0, .5, 1} (\(\theta_1\) was standard normal)
SD of \(\theta_2\): {.5, 1, 1.5, 2}
Skewness of \(\theta_2\): {0, .75}
Kurtosis of \(\theta_2\): {0, 3}
Correlation between \(\theta_1\) and \(\theta_2\): {0, .3, .6, .9}
20 replications
Does alignment work?
| | Before Alignment | After DDA | After LRA |
|---|---|---|---|
| Median \(\tau(\hat{\boldsymbol{p}},\hat{\boldsymbol{\delta}})\) | \(.987\) | \(1.000\) | \(.997\) |
| \(\%\tau(\hat{\boldsymbol{p}},\hat{\boldsymbol{\delta}})=1\) | \(23\%\) | \(55\%\) | \(55\%\) |
Does DDA or LRA work better?
When is alignment most useful?
Criterion: \(\tau(\hat{\boldsymbol{p}},\hat{\boldsymbol{\delta}})\) before alignment
Predictors: All manipulated factors + interactions
Multiple \(R^2 =.38\)
| Source | df | \(F\) | \(\eta^{2}\) |
|---|---|---|---|
| \(\text{sd}(\theta_{2})\) | 3 | 8483 | \(.272\) |
| \(\text{kurt}(\theta_{2})\) | 1 | 698 | \(.007\) |
| \(n_{1}\times n_{2}\) | 1 | 3397 | \(.036\) |
| \(\text{sd}(\theta_{2})\times\text{kurt}(\theta_{2})\) | 3 | 300 | \(.009\) |
| \(n_{1}\times n_{2}\times\text{sd}(\theta_{2})\) | 3 | 471 | \(.015\) |
\(\tau(\hat{\boldsymbol{p}},\hat{\boldsymbol{\delta}})\) higher when \(\text{sd}(\theta_{2})\) closer to 1
\(\tau(\hat{\boldsymbol{p}},\hat{\boldsymbol{\delta}})\) slightly higher with higher \(\text{kurt}(\theta_{2})\)
More likely to have perfect alignment initially if \(n_{1}=n_{2}=5\)
What is the relationship between \(\hat{r}_2\), \(\hat{s}_2\) and the latent trait distributions?
\(\theta_1 \sim N(0, 1)\)
mean \(\theta_2 \in \{0, .5, 1, 1.5, 2\}\)
standard deviation \(\theta_2 \in \{.5, 1, 1.5, 2\}\)
skewness \(\theta_2 \in \{0, .5, 1, 1.5, 2\}\)
kurtosis \(\theta_2 \in \{-2, -1, 0, 1, 2, 3\}\)
Between-items multidimensional partial credit model (PCM):
\[\ln\left[\frac{P_{i(d)j}(x)}{P_{i(d)j}(0)}\right] = \sum_{k=0}^x \alpha_d (\theta_{dj} - \xi_{i(d)k})\]
where
\[P_{i(d)j}(x=0) = \frac{1}{\sum_{l=0}^{m_{i(d)}}\exp \sum_{k=0}^l \alpha_d(\theta_{dj} - \xi_{i(d)k})}\]
\[\exp \sum_{k=0}^0 \alpha_d(\theta_{dj} - \xi_{i(d)k}) = 1\]
Another way to express the between-items multidimensional PCM:
\[\ln\left[\frac{P_{i(d)j}(x)}{P_{i(d)j}(0)}\right] = \sum_{k=0}^x \alpha_d (\theta_{dj} - \delta_{i(d)} - \tau_{i(d)k})\]
\(\delta_{i(d)}\): average item step parameter
\(\tau_{i(d)k}\): step deviation
\(\xi_{i(d)k} = \delta_{i(d)} + \tau_{i(d)k}\)
Problem: \(\xi_{i(d)k}\) parameters are not necessarily ordered
Alternative: Thurstone thresholds \(\lambda_{i(d)k}\)
\[\begin{align*} \tilde{\theta}_{d} & =r_{d}\theta_{d}+s_{d}\\ \tilde{\alpha}_{d} & =\alpha_{d}/r_{d}\\ \tilde{\delta}_{i(d)} & =r_{d}\delta_{i(d)}+s_{d}\\ \tilde{\tau}_{i(d)k} & =r_{d}\tau_{i(d)k}\\ \tilde{\xi}_{i(d)k} & =r_{d}\xi_{i(d)k}+s_{d} \end{align*}\]
Before, with dichotomous items:
Several quantities serve the role of the \(\delta_{i(d)}\) parameters
\(\xi_{i(d)k}\) are the quantities compared to \(\theta_d\) in the model equation
\(\delta_{i(d)}\) is an overall item location
\(\lambda_{i(d)k}\) indicates the location at which there is a .5 probability of a response in category \(k\) or higher
The sufficient statistics for the PCM, \(\hat{p}_{i(d)k}\), are the proportions of responses that reach at least score category \(k\), for \(k = 1, \ldots, m_{i(d)}\)
Within items, \(\xi_{i(d)k}\) not necessarily monotonically related to \(\hat{p}_{i(d)k}\)
Within items, \(\lambda_{i(d)k}\) necessarily monotonically related to \(\hat{p}_{i(d)k}\)
Working definition: dimensions are aligned if the same \(\hat{p}_{i(d)k}\) implies the same \(\lambda_{i(d)k}\) regardless of dimension
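The contrast between step parameters and Thurstone thresholds can be sketched numerically. The code below assumes the PCM parameterization given earlier and finds each \(\lambda_{i(d)k}\) by bisection as the location where the probability of responding in category \(k\) or higher equals .5. The function names, search bounds, and step values are illustrative assumptions.

```python
import math

def pcm_probs(theta, xi, alpha=1.0):
    """Category probabilities 0..m for one PCM item with step parameters xi."""
    cum = [0.0]  # the k = 0 term is defined to be 0
    for step in xi:
        cum.append(cum[-1] + alpha * (theta - step))
    top = max(cum)  # subtract the maximum for numerical stability
    weights = [math.exp(c - top) for c in cum]
    total = sum(weights)
    return [w / total for w in weights]

def thurstone_thresholds(xi, alpha=1.0, lo=-30.0, hi=30.0, tol=1e-10):
    """Locations where P(X >= k) = .5, found by bisection on theta.
    Relies on P(X >= k) increasing in theta."""
    thresholds = []
    for k in range(1, len(xi) + 1):
        a, b = lo, hi
        while b - a > tol:
            mid = (a + b) / 2.0
            if sum(pcm_probs(mid, xi, alpha)[k:]) < 0.5:
                a = mid
            else:
                b = mid
        thresholds.append((a + b) / 2.0)
    return thresholds

# Hypothetical item whose step parameters are disordered (first step > second).
xi_disordered = [0.5, -0.5]
lam = thurstone_thresholds(xi_disordered)
print([round(value, 3) for value in lam])
```

Even though the step parameters are disordered, the thresholds come out in increasing order, which is the property that makes them usable for evaluating alignment.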
DDA method:
Fit data to both unidimensional and multidimensional models
Find a transformation so that, for each dimension, the mean and standard deviation of the multidimensional parameters equal the mean and standard deviation of the unidimensional parameters
What parameter to use? All of these have similarities to the \(\delta_{i(d)}\) of binary models.
\(\xi_{i(d)k}\)
\(\delta_{i(d)}\)
\(\lambda_{i(d)k}\)
Remember: DDA relies on a strong within-dimension relationship between \(\mathcal{M}\) and \(\mathcal{U}\)
Follow mostly the same procedures as before
DDA:
LRA:
Evaluate alignment as the absolute rank-order correlation between \(\hat{\boldsymbol{p}}\) and \(\hat{\boldsymbol{\lambda}}\), across dimensions
Kindergarten Individual Development Survey (KIDS)
Ratings taken on 59,429 kindergarten-age children in Illinois in spring 2015
6 rating categories per measure (item)
Domains:
ATL-REG: Approaches to Learning and Self-Regulation - 4 measures
SED: Social and Emotional Development - 4 measures
LLD: Language and Literacy Development - 10 measures
COG: Math - 10 measures
PDH: Physical Development and Health - 9 measures
Before alignment, \(\tau(\hat{\boldsymbol{p}}, \hat{\boldsymbol{\lambda}}) = .924\)
| | | ATL-REG | SED | LLD | COG | PDH |
|---|---|---|---|---|---|---|
| DDA \(\hat{\boldsymbol{\delta}}\) | \(\hat{r}\) | \(1.000\) | \(0.832\) | \(0.973\) | \(0.912\) | |
| \(\tau(\hat{\boldsymbol{p}}, \hat{\boldsymbol{\lambda}})=.971\) | \(\hat{s}\) | \(0.000\) | \(-0.005\) | \(0.006\) | \(0.009\) | |
| DDA \(\hat{\boldsymbol{\lambda}}\) | \(\hat{r}\) | \(1.000\) | \(1.097\) | \(1.071\) | \(1.034\) | |
| \(\tau(\hat{\boldsymbol{p}}, \hat{\boldsymbol{\lambda}})=.937\) | \(\hat{s}\) | \(0.000\) | \(0.092\) | \(0.098\) | \(0.071\) | |
| LRA | \(\hat{r}\) | \(1.000\) | \(1.095\) | \(1.081\) | \(1.050\) | |
| \(\tau(\hat{\boldsymbol{p}}, \hat{\boldsymbol{\lambda}})=.934\) | \(\hat{s}\) | \(0.000\) | \(0.072\) | \(0.100\) | \(0.129\) | |
N = 500 subjects
5 or 20 items per dimension
Correlation between dimensions: {0, .5}
\(\text{sd}(\theta_2)\): {.5, 1, 2}
response categories per item: {3, 5, 7}
Percentage of missing data: {0%, 5%, 50%}
Note: can always perform all 3 methods and select which works “best”, i.e., leads to the highest absolute rank-order correlations between \(\hat{\boldsymbol{\lambda}}\) and \(\hat{\boldsymbol{p}}\)
Major findings:
DDA \(\hat{\boldsymbol{\lambda}}\) and LRA reliably led to increases in rank-order correlations, DDA \(\hat{\boldsymbol{\delta}}\) sometimes led to decreases in rank-order correlations
DDA \(\hat{\boldsymbol{\delta}}\) was the preferred method most often when \(\text{sd}(\theta_2) = 1\)
With 3 response categories, DDA \(\hat{\boldsymbol{\lambda}}\) tended to be selected most often
For 5 or 7 response categories, LRA tended to be selected most often
Absolute rank-order correlations were generally higher for shorter tests and smaller proportions of missing data
No large effects associated with other manipulated factors
Summary:
A concrete definition of aligned scales
Evidence that alignment largely accounts for differences in latent trait distributions
2 methods to transform fitted models
Preliminary evidence that both methods are effective, at least in some datasets
Encouragement to test all methods, use the best results
Future directions:
Align while fitting the model, rather than a post-hoc transformation
Polytomous models with different numbers of response categories
Load in packages:
Read in example data and define Q matrix:
Fit initial model:
Align using DDA:
Align using LRA:
View transformation coefficients:
1 2
1.000000 1.086073
1 2
0.00000000 -0.00131837
[1] 1.000000 1.107417
[1] 0.00000000 -0.01236789
The aligned_DDA and aligned_LRA objects are TAM objects, so any function that can be used with TAM objects can be used with the aligned model.
View transformed difficulties:
before DDA LRA
1 0.20 0.20 0.20
2 0.46 0.46 0.46
3 0.71 0.71 0.71
4 0.80 0.80 0.80
5 0.68 0.68 0.68
6 0.77 0.77 0.77
7 1.23 1.23 1.23
8 -0.29 -0.29 -0.29
9 0.31 0.31 0.31
10 0.19 0.19 0.19
11 0.11 0.11 0.10
12 0.71 0.71 0.70
13 1.08 1.08 1.07
14 2.55 2.55 2.54
15 0.80 0.80 0.79
16 0.48 0.48 0.47
17 1.29 1.29 1.28
18 0.10 0.10 0.09
19 0.42 0.42 0.41
20 0.70 0.70 0.69
pid case pweight score max EAP.Dim1 SD.EAP.Dim1 EAP.Dim2 SD.EAP.Dim2
1 1 1 1 3 20 -1.1971525 0.6786371 -0.61888966 0.5687768
2 2 2 1 3 20 -1.1971525 0.6786371 -0.61888966 0.5687768
3 3 3 1 8 20 0.3438237 0.5649660 -0.01823904 0.5296354
4 4 4 1 14 20 1.2335182 0.5851602 1.17575120 0.5200049
5 5 5 1 5 20 -0.6874662 0.6227283 -0.22287505 0.5418169
6 6 6 1 1 20 -1.4276290 0.7120677 -1.32802377 0.6251602
pid case pweight score max EAP.Dim1 SD.EAP.Dim1 EAP.Dim2 SD.EAP.Dim2
1 1 1 1 3 20 -1.1971502 0.6786302 -0.67347797 0.6177026
2 2 2 1 3 20 -1.1971502 0.6786302 -0.67347797 0.6177026
3 3 3 1 8 20 0.3438138 0.5649624 -0.02110962 0.5752058
4 4 4 1 14 20 1.2335267 0.5851569 1.27556287 0.5647479
5 5 5 1 5 20 -0.6874625 0.6227231 -0.24339467 0.5884401
6 6 6 1 1 20 -1.4276486 0.7120637 -1.44357454 0.6789306
Feuerstahler, L. M., & Wilson, M. (2019). Scale alignment in between-item multidimensional Rasch models. Journal of Educational Measurement, 56(2), 280-301.
Feuerstahler, L. M., & Wilson, M. (2021). Scale alignment in the between-items multidimensional partial credit model. Applied Psychological Measurement, 45(4), 268-282.